Abstract

Self-organizing heuristics are often adopted to maintain a
sequential list in near-optimal order when the frequency of
accessing each element is not known in advance. An account
of the existing schemes is given, and the self-organizing
heuristics are modified to handle the case of a dynamic
fixed-size list, where only a subset of elements is
maintained in the sequential list at a given time. An
application of this modified scheme is the construction of
paging algorithms for managing storage hierarchies.


This thesis examines the performance of the modified
scheme, with respect to the average time required to search
for an element. The analysis of the performance involves
modelling the stream of requests as a Markov chain. The
class of modified move to front heuristics is studied in
detail. Empirical evidence, including numerical examples and
simulation results, suggests that the class of modified
transposition heuristics is asymptotically more efficient
than the corresponding move to front heuristics.
I. Introduction


Suppose we have a set of n records, $R_1, R_2, \dots, R_n$,
which is arranged in some arbitrary serial order $\pi$, where
$R_i$ is in position $\pi(i)$. At each time instant, t, a
record is asked for. To fetch the requested record $R_i$, the
linear list is searched sequentially, starting from position
1, until the record is found in position $\pi(i)$.


Assume that each record $R_i$ has a probability $p_i$ of
being selected, and that $p_i$ satisfies the following
conditions:

1. $p_i > 0$ for every i;
2. $\sum_{i=1}^{n} p_i = 1$;
3. $p_i$ remains unchanged and independent at all times,
   i.e., successive requests are probabilistically
   independent of each other and of the ordering of the
   records (INDEPENDENCE ASSUMPTION).

In terms of the expected number of searches required to
retrieve a record, a random arrangement would be expected to
perform poorly, as it requires (n+1)/2 searches on the
average. Unless all the records have an equal chance of
being accessed, a rearrangement of the records could reduce
the expected search cost. The obvious approach would be to
order the records by non-increasing retrieval probabilities.
However, in practice, these probabilities are rarely known a
priori, and optimal ordering of the records cannot be done
in advance. Under this circumstance, heuristics that
dynamically rearrange the records are often used to reduce
expected search cost. The idea behind these heuristics is to
utilize previous accesses as a basis for rearranging the
records into a less costly ordering.
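
The claim that ordering by non-increasing retrieval probability minimizes the expected search cost, and that a random arrangement costs (n+1)/2 on average, can be checked on a small case. The sketch below is illustrative only; the function name `expected_cost` and the three probabilities are assumptions chosen for the example, not values from the thesis.

```python
from itertools import permutations

def expected_cost(order, p):
    """Expected number of comparisons for a sequential search.

    order[k] is the index of the record stored at position k+1,
    p[i] is the (assumed known) access probability of record i.
    """
    return sum((pos + 1) * p[rec] for pos, rec in enumerate(order))

# Hypothetical access probabilities for three records.
p = [0.1, 0.6, 0.3]
orders = list(permutations(range(len(p))))

# Brute-force optimum versus "sort by non-increasing probability".
best = min(orders, key=lambda o: expected_cost(o, p))
by_prob = tuple(sorted(range(len(p)), key=lambda i: -p[i]))

# Average cost over all orderings, i.e. a uniformly random arrangement.
random_avg = sum(expected_cost(o, p) for o in orders) / len(orders)
```

Here the brute-force optimum coincides with the probability-sorted order, and the average over all arrangements equals (n+1)/2 = 2.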


One approach would be to count the number of requests 
for a record, and rearrange the records in non-increasing 
order by request counts. Given enough time, by the law of 
large numbers, records with higher retrieval probabilities 
will have higher access frequencies, and hence optimal 
ordering could be reached. However, the major problem with 
this approach is the extra amount of storage incurred to 
keep the counts. The cost of maintaining the counts is often 
prohibitive (see Knuth[10]), and one has to resort to
memory-free self-organizing heuristics.


A. Memory Free Self-Organizing Heuristics 

The two most frequently mentioned memory-free
self-organizing heuristics are the move to front and the
transposition schemes. These two heuristics have been
studied by Bitner[1], Burville and Kingman[2], Gonnet, Munro
and Suwanda[5], Hendricks[6,7,8], Knuth[10], Lam, Leung, and
Siu[11], McCabe[12], Rivest[13], and Tanenbaum[14]. The
basic approach of these two heuristics is to perform a
permutation upon the existing order of the records every
time a record is requested, according to some specific rules
which will be explained later.
Define a self-organizing scheme to be a set of
permutations, $\{\tau_i, 1 \le i \le n\}$, where $\tau_i$ is
the permutation performed when the record in position i is
requested. After application of $\tau_i$, the record in
position j will be moved to position $\tau_i(j)$. The two
schemata are given below.


Move to Front Heuristics (MTF) 


\[
\tau_i(j) =
\begin{cases}
1,   & j = i \\
j+1, & 1 \le j < i \\
j,   & j > i
\end{cases}
\]

i.e., when a record in position i is accessed, this
requested record will be placed in position 1, while records
in positions 1 through i-1 are moved back one position to
make room for it. The ordering remains the same if the
requested record heads the list.


Transposition Heuristics (TR) 
i.e., a requested record is exchanged with the one preceding
it. Similarly here, requesting a record in position 1 leaves 
the ordering unchanged. 
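
The two rules can be sketched as plain list operations. This is a minimal illustration under the assumption that the list is a Python list and positions are 1-based as in the text; the function names are hypothetical.

```python
def move_to_front(records, i):
    """Access the record in position i (1-based) under the MTF rule.

    The accessed record goes to position 1; the records that were in
    positions 1..i-1 each move back one position.
    """
    k = i - 1
    return [records[k]] + records[:k] + records[k + 1:]

def transpose(records, i):
    """Access the record in position i (1-based) under the TR rule.

    The accessed record is exchanged with its predecessor; accessing
    position 1 leaves the ordering unchanged.
    """
    k = i - 1
    out = list(records)
    if k > 0:
        out[k - 1], out[k] = out[k], out[k - 1]
    return out

lst = ["a", "b", "c", "d"]
after_mtf = move_to_front(lst, 3)   # "c" jumps to the head
after_tr = transpose(lst, 3)        # "c" swaps with "b"
```

Both functions return a new list, so the two rules can be compared on the same starting arrangement.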


McCabe[12] has shown that asymptotically the expected
search time, $\mu$, for the MTF scheme satisfies

\[
\mu(\mathrm{MTF}) = 1 + 2 \sum_{1 \le i < j \le n} \frac{p_i p_j}{p_i + p_j}
\]
Burville and Kingman[2] have shown that, if
$p_1 \ge p_2 \ge p_3 \ge \dots \ge p_n$, then

\[
\mu(\mathrm{MTF}) < 2m
\]

where m is the expected search cost for the optimal ordering
$R_1 R_2 R_3 \dots R_n$. The above inequality shows that MTF
never does more than twice the work done with the optimal
ordering.


The MTF scheme does exhibit considerable savings over
random arrangement. As an example, Knuth[10] showed that, if
the retrieval probabilities obey Zipf's distribution, i.e.
$p_i = 1/(i H_n)$, where $H_n$ is the nth harmonic number
($H_n = \sum_{i=1}^{n} 1/i$), then
$\mu(\mathrm{MTF}) \approx 2n/\log_2 n$. This cost is
substantially better than (n+1)/2, and is about
$\ln 4 \approx 1.386$ times as many comparisons as would be
obtained in the optimal arrangement.
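
For small n, the Zipf case can be evaluated exactly. The sketch below assumes the standard closed form for the asymptotic MTF cost, $1 + 2\sum_{i<j} p_i p_j/(p_i+p_j)$, usually attributed to McCabe; the helper names `zipf`, `optimal_cost` and `mtf_cost` are hypothetical.

```python
def zipf(n):
    """Zipf probabilities p_i = 1 / (i * H_n) for i = 1..n."""
    h = sum(1.0 / i for i in range(1, n + 1))
    return [1.0 / (i * h) for i in range(1, n + 1)]

def optimal_cost(p):
    """Expected cost when records are sorted by non-increasing probability."""
    q = sorted(p, reverse=True)
    return sum((k + 1) * x for k, x in enumerate(q))

def mtf_cost(p):
    """Asymptotic MTF cost via the assumed closed form 1 + 2*sum p_i p_j/(p_i+p_j)."""
    n = len(p)
    pairs = sum(p[i] * p[j] / (p[i] + p[j])
                for i in range(n) for j in range(i + 1, n))
    return 1.0 + 2.0 * pairs

for n in (3, 4, 5):
    p = zipf(n)
    print(n, round(optimal_cost(p), 4), round(mtf_cost(p), 4))
```

For n = 3, 4, 5 this gives optimal costs of about 1.6364, 1.9200, 2.1898 and MTF costs of about 1.8545, 2.2411, 2.6104, so the ratio is still well below the limiting value ln 4 at these sizes.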


The expected search cost of the transposition heuristic
can be evaluated for a specific retrieval distribution (see
Hendricks[8]). However, no concise general expression is
given. Rivest[13] has shown that the expected search cost
$\mu(\mathrm{TR})$ is always lower than $\mu(\mathrm{MTF})$,
except when n=2 or all $p_i$'s are equal. Rivest also
conjectures that TR is asymptotically more efficient than
any other memory-free self-organizing heuristic, for any
retrieval probability distribution. Various evidence to
support this conjecture is presented; however, no direct
proof has been given.


B. Self-Organizing Heuristics with Limited Storage 


Gonnet, Munro, and Suwanda[5] later showed that, with
the use of a very small amount of storage to remember a few
previous requests, expected search cost could be further
reduced. The notion of a k in a row heuristic is introduced.
The first approach is to apply the transposition
(move-to-front or any other) heuristic only if the same
element is accessed k times in a row. This approach would
use $\log_2(nk)$ bits of extra storage instead of the O(n) or
more bits required for the counter scheme. This is known as
the simple k heuristic. A slight variation of this scheme,
known as the batched k heuristic, will be explained later.


Simple k Heuristics 


The expected search cost of the simple k MTF heuristic
is given in [5]:

\[
\mu_k(\mathrm{MTF}) = 1 + \sum_{i \ne j} p_i \,
\frac{p_j^k \sum_{t=0}^{k-1} p_i^t}
     {p_i^k \sum_{t=0}^{k-1} p_j^t + p_j^k \sum_{t=0}^{k-1} p_i^t}
\]

It is also shown that $\mu_k(\mathrm{MTF}) < \mu_{k-1}(\mathrm{MTF})$
for $k \ge 2$. The following example, taken directly from the
paper, demonstrates the performance of the scheme
UME) a I 6602s OpE 
eS (MOE) lee? 60 Opt 


u,(MTF) < 1.22788 Opt 


Although the exact search cost for the simple k
heuristic with transposition rule, $\mu_k(\mathrm{TR})$, has
not been obtained, the cost has been shown to be better than
that of the simple k move to front heuristic. The expected
search cost, $\mu_k(\mathrm{TR})$, is also shown to decrease
as k increases.
Batched k Heuristics


This is a slight variation of the simple k heuristic in
that requests are batched into groups of k elements, and
reorganizing occurs only when the k requests in a batch are
for the same element. In other words, when an element is
accessed k times in a row, the transposition (or move to
front) rule may or may not be applied, depending on whether
the k accesses belong to the same batch.
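
The difference between the two triggers can be sketched as two small counters. This is an illustration, not the construction in [5]; the class names are hypothetical, and the simple k trigger is assumed to keep firing for every further consecutive access once the run has reached length k.

```python
class SimpleK:
    """Fires when the current request completes a run of >= k equal requests."""

    def __init__(self, k):
        self.k = k
        self.last = None
        self.run = 0

    def request(self, x):
        if x == self.last:
            self.run += 1
        else:
            self.last, self.run = x, 1
        return self.run >= self.k   # assumption: longer runs keep firing


class BatchedK:
    """Fires only at the end of a batch of k requests that are all equal."""

    def __init__(self, k):
        self.k = k
        self.batch = []

    def request(self, x):
        self.batch.append(x)
        if len(self.batch) < self.k:
            return False
        fire = all(y == x for y in self.batch)
        self.batch = []             # start a new batch
        return fire


simple, batched = SimpleK(2), BatchedK(2)
simple_fires = [simple.request(x) for x in "aaabb"]
batched_fires = [batched.request(x) for x in "aaabb"]
```

On the request string `aaabb` with k = 2, the final two requests for `b` are consecutive but straddle a batch boundary, so only the simple rule fires on them.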


In [5], the expected search cost for the MTF scheme
under the batched k heuristic is given as

\[
\mu'_k(\mathrm{MTF}) = 1 + \sum_{i \ne j} \frac{p_i \, p_j^k}{p_i^k + p_j^k}
\]

for $k \ge 2$. In addition, the batched k heuristic is shown
to perform better than the corresponding simple k heuristic

\[
\mu'_k(\mathrm{MTF}) < \mu_k(\mathrm{MTF})
\]
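
The two costs can be compared numerically. The sketch below assumes the pairwise forms: under simple k MTF, $R_j$ precedes $R_i$ with probability $p_j^k S_i / (p_i^k S_j + p_j^k S_i)$ where $S_x = \sum_{t=0}^{k-1} p_x^t$, and under batched k MTF with probability $p_j^k/(p_i^k + p_j^k)$. These are my reading of the results quoted from [5]; the function names are hypothetical.

```python
def zipf(n):
    """Zipf probabilities p_i = 1 / (i * H_n)."""
    h = sum(1.0 / i for i in range(1, n + 1))
    return [1.0 / (i * h) for i in range(1, n + 1)]

def simple_k_mtf_cost(p, k):
    """Asymptotic cost of simple k MTF under the assumed pairwise formula."""
    n = len(p)
    s = [sum(x ** t for t in range(k)) for x in p]
    total = 1.0
    for i in range(n):
        for j in range(n):
            if i != j:
                j_ahead = p[j] ** k * s[i] / (p[i] ** k * s[j] + p[j] ** k * s[i])
                total += p[i] * j_ahead
    return total

def batched_k_mtf_cost(p, k):
    """Asymptotic cost of batched k MTF under the assumed pairwise formula."""
    n = len(p)
    total = 1.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += p[i] * p[j] ** k / (p[i] ** k + p[j] ** k)
    return total

p4 = zipf(4)
simple2 = simple_k_mtf_cost(p4, 2)    # about 2.1089
batched2 = batched_k_mtf_cost(p4, 2)  # about 2.0842
plain = simple_k_mtf_cost(p4, 1)      # k = 1 reduces to ordinary MTF
```

For a four-element Zipf list both k = 2 variants beat ordinary MTF (about 2.2411), and the batched variant is the cheaper of the two.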
Again, the following example is taken directly from the
paper

The expected search cost of the batched k transposition
heuristic is also shown to satisfy the following
inequalities
C. Convergence Rate of Self-Organizing Heuristics


The primary measure of effectiveness of a heuristic has
been the asymptotic search cost. A natural concern is the
rate of convergence of these heuristics to their asymptotic
efficiencies. Bitner[1] shows that although the
transposition rule is asymptotically better than the move to
front rule, the latter converges to its asymptotic value
more rapidly. This observation suggests that the move to
front heuristic may be preferable if fewer requests will be
made to the list, or if the probability of access keeps
varying.


According to Bitner, the first measure of convergence
is overwork, which is defined as the area between the cost
curve and its asymptote (for details refer to [1]). The
"steeper" the cost curve is, the smaller the overwork will
be. The overwork for the conventional MTF rule is given as

\[
\mathrm{Ov}(\mathrm{MTF}) = \sum_{1 \le i < j \le n} \frac{(p_i - p_j)^2}{2 (p_i + p_j)^2}
\]
No general form has been found for the overwork for the
transposition rule. However, there are good reasons to
believe that the transposition rule does converge more
slowly. In the move to front rule, records with higher
access probability rise quickly to the top of the list. In
contrast, the transposition rule allows a record to rise one
position at a time, thus decreasing the search cost more
slowly. Through simulation, Bitner has shown that
Ov(MTF) < Ov(TR) when the retrieval probabilities obey
Zipf's distribution.


The second measure of convergence rate suggested by
Bitner[1] is the number of requests, r, required for the
total cost of one heuristic to be lower than that of the
other. The total cost is obtained by summation of the search
cost for the first r requests. This measure suggests that,
for fewer than r requests, one heuristic may outperform the
other, even though it has a higher asymptotic search cost.
Some values of r are given in Table 1.3a at the end of this
chapter.


In the remainder of this chapter, we shall study the
convergence rate of the k in a row heuristics. Through
numerical examples, we can show that the k in a row
heuristics (MTF or TR) take longer to reach their
respective asymptotic search cost than the corresponding
simple heuristics. This observation suggests that, although
the k in a row heuristics (MTF or TR) have been shown to
have lower asymptotic search cost than the corresponding
simple heuristics (i.e., k=1), their slow rates of
convergence to the asymptotic value should not be
overlooked.


Now we shall consider the overwork for the simple and
batched k heuristics. Intuitively, one would expect, for
k > 1, Ov(MTF) < Ov(simple k MTF) < Ov(batched k MTF),
Ov(TR) < Ov(simple k TR) < Ov(batched k TR),
Ov(simple k MTF) < Ov(simple k TR), and
Ov(batched k MTF) < Ov(batched k TR). The reason behind this
is that the heuristic that is less likely to change the
previously established order decreases the search cost more
slowly. Unfortunately, no general expression has been found
for the overwork for the transposition heuristics and simple
k MTF. However, the overwork for batched k MTF can be
derived as follows
\[
\mathrm{Ov}(\text{batched } k \text{ MTF}) =
\sum_{1 \le i < j \le n} \frac{k \, (p_i - p_j)(p_i^k - p_j^k)}{2 \, (p_i^k + p_j^k)^2}
\]
PROOF:


Assume that any initial arrangement of records in the list
is equally likely, and the first request for retrieving a
record is at t=0. Subsequent requests are at t=1,2,3,...

1. Prob($R_i$ is ahead of $R_j$ at t requests,
   $0 \le t \le k-1$) = 1/2, since $R_i$ has a 1/2
   probability of being ahead of $R_j$ in the initial
   ordering, and no rearrangement occurs when t < k.

2. Record $R_i$ is ahead of $R_j$ at t requests, $t \ge k$,
   under one of the following two conditions:
   a. neither $R_i$ nor $R_j$ was requested consecutively k
      times in a batch and $R_i$ was initially ahead of $R_j$.

      \[
      \mathrm{Prob} = \tfrac{1}{2} \, (1 - p_i^k - p_j^k)^{\lfloor t/k \rfloor}
      \]

      ($\lfloor x \rfloor$ refers to the largest integer less
      than or equal to x)

   b. the k most recent requests for $R_i$ were in batch m,
      and $R_j$ was not requested consecutively k times in a
      batch after m.
Since Equation (1.1) above reduces to 1/2 when
$0 \le t < k$, it is also the general expression for
Prob($R_i$ is ahead of $R_j$ at t requests).
Hence, if each initial list is equally likely, the
search cost at time t

\[
= \sum_i p_i \Big\{ 1 + \sum_{j \ne i} \mathrm{Prob}(R_j \text{ is ahead of } R_i \text{ at time } t) \Big\}
\]

The first two terms of the above equation are the asymptotic
search cost; hence the overwork at time t is given by the
last term. Summing the overwork over $0 \le t < \infty$ gives


Ov(batched k MTF)
Ov(batched k MTF) > k Ov(simple MTF)


for k > 1 and not all $p_i$'s are equal.
PROOF:


Ov(batched k MTF)
k Ov(simple MTF) ∎
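
The overwork expression for batched k MTF is easy to evaluate directly. The sketch below assumes the closed form $\sum_{i<j} k (p_i - p_j)(p_i^k - p_j^k) / (2 (p_i^k + p_j^k)^2)$, which reduces to Bitner's MTF overwork at k = 1; the function names are hypothetical.

```python
def zipf(n):
    """Zipf probabilities p_i = 1 / (i * H_n)."""
    h = sum(1.0 / i for i in range(1, n + 1))
    return [1.0 / (i * h) for i in range(1, n + 1)]

def batched_k_mtf_overwork(p, k):
    """Overwork of batched k MTF under the assumed closed form (k = 1 is plain MTF)."""
    n = len(p)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            num = k * (p[i] - p[j]) * (p[i] ** k - p[j] ** k)
            den = 2.0 * (p[i] ** k + p[j] ** k) ** 2
            total += num / den
    return total

p3 = zipf(3)
ov1 = batched_k_mtf_overwork(p3, 1)   # about 0.2006
ov2 = batched_k_mtf_overwork(p3, 2)   # about 1.6454
ov3 = batched_k_mtf_overwork(p3, 3)   # about 7.5675
```

For a three-element Zipf list the overwork grows much faster than linearly in k, so it exceeds k times the simple MTF overwork in this example.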


To demonstrate the convergence rates of the other
heuristics, the overwork for a list of n elements under
different self-organizing rules is approximated in the
following manner.


The conventional method of analyzing the self-organizing
schemes is to model the different arrangements of the list
as states in a Markov chain whose transition matrix is
determined by the retrieval probabilities. Let
$\mathrm{Prob}[\pi_i^{(t)}]$ denote the probability of the
state $\pi_i$ at time t.

Initially, each of the n! possible arrangements is
equally likely. For a particular state $\pi_i$,
$\mathrm{Prob}[\pi_i^{(0)}] = 1/n!$, and successive
distributions of the state $\pi_i$, for t > 0, can be
obtained according to the following equations


1. Simple MTF Heuristic
3. Simple k Heuristics


Let $(R_{i1} R_{i2} \dots R_{in})_0$ be the initial state where records

4. Batched k Heuristics


ProODiR wsRpoere Rigo = ProbiRaahisesen ial 
The asymptotic distribution is approximated by the
distribution when t is large enough, and the overwork can be
obtained by summation of the difference between the search
cost at each time instant and the asymptotic search cost.
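
The approximation procedure can be sketched for the two simple heuristics by evolving the state distribution exactly over all n! arrangements. This is a minimal illustration under stated assumptions: a uniform initial distribution, the distribution after `horizon` steps standing in for the asymptotic one, and hypothetical function names throughout.

```python
from itertools import permutations

def mtf(state, pos):
    """State after requesting the record at 0-based position pos, MTF rule."""
    return (state[pos],) + state[:pos] + state[pos + 1:]

def tr(state, pos):
    """State after requesting the record at 0-based position pos, TR rule."""
    if pos == 0:
        return state
    s = list(state)
    s[pos - 1], s[pos] = s[pos], s[pos - 1]
    return tuple(s)

def cost(dist, p):
    """Expected search cost of the next request under distribution dist."""
    return sum(prob * sum((k + 1) * p[r] for k, r in enumerate(state))
               for state, prob in dist.items())

def step(dist, p, rule):
    """One transition of the Markov chain over arrangements."""
    new = {s: 0.0 for s in dist}
    for state, prob in dist.items():
        for pos, rec in enumerate(state):
            new[rule(state, pos)] += prob * p[rec]
    return new

def asymptote_and_overwork(p, rule, horizon=2000):
    states = list(permutations(range(len(p))))
    dist = {s: 1.0 / len(states) for s in states}
    costs = []
    for _ in range(horizon):
        costs.append(cost(dist, p))
        dist = step(dist, p, rule)
    limit = cost(dist, p)          # stands in for the asymptotic cost
    return limit, sum(c - limit for c in costs)

h3 = 1 + 1 / 2 + 1 / 3
p3 = [1 / (i * h3) for i in (1, 2, 3)]   # Zipf, n = 3
mtf_limit, mtf_ov = asymptote_and_overwork(p3, mtf)
tr_limit, tr_ov = asymptote_and_overwork(p3, tr)
```

For the three-element Zipf list, the MTF run should settle at a cost near 1.8545 with overwork near 0.2006, while the TR run settles lower, near 1.8182, but accumulates more overwork on the way.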


Tables 1.1a, 1.1b and 1.2 below show the expected
search cost and the overwork for a list of n elements whose
retrieval distribution is given by Zipf's distribution. The
results obtained suggest that although the k in a row
heuristics are asymptotically more efficient, they generally
incur greater overwork as well. However, the study of
overwork alone is not a good measure of the rate of
convergence; the asymptotic search cost should also be taken
into consideration. Since overwork is defined in terms of
the difference between search cost at each time instant and
the asymptotic search cost, the heuristic with lower
asymptotic search cost would generally result in larger
overwork.


The second measure of convergence rate is the number of
requests required for the total cost of one heuristic to be
lower than that of the other. The measurement is
approximated in a manner similar to the approximation of
overwork. Tables 1.3a and 1.3b below summarize the results
obtained.


From the study of the convergence rates, it may be seen
that heuristics that are asymptotically more efficient may
not be a preferable choice if few requests are made.
Unfortunately, at present, there is no known method of
deciding the optimal heuristic for the number of requests
required.


D. Overview of Subsequent Chapters 


In the next few chapters, we shall consider a linear
search system with a fixed number of records, and a
fixed-size list, whose size may be smaller than or equal to
the number of records. Only a subset of records can be
maintained in the list at one time. The frequency of
accessing each record is not known in advance. With some
slight modifications, the self-organizing heuristics
discussed so far can be adapted to improve performance.


The performance of the class of move to front
heuristics is studied in detail. Through numerical examples,
the class of transposition heuristics is compared to the
move to front counterpart. The results obtained suggest that
the transposition rule has lower asymptotic search cost than
the corresponding move to front rule, and the k in a row
heuristics again outperform the simple heuristics
asymptotically.


An application of the modified scheme is in the
construction of paging algorithms in memory management.
While the commonly known paging algorithm, Least Recently
Used (LRU), corresponds to the modified simple move to front
heuristic, the results obtained suggest that the
transposition rule or k in a row heuristics may be
incorporated into the construction of the paging algorithms
to improve performance.
TABLE 1.1a

The asymptotic search length for a list of n elements whose
retrieval probabilities are given by Zipf's distribution, in
terms of the number of searches through the list.


                        number of elements

Heuristic               3             4             5

optimal cost (opt)      1.6364        1.9200        2.1898
Simple MTF rule         1.8545        2.2411        2.6104
Simple TR rule          1.8182        2.1392        2.4303
Simple 2 MTF rule       1.7754        2.1089        2.4227
Simple 2 TR rule        1.7414        2.0350        2.3129
batched 2 MTF rule      1.7552        2.0842        2.3952
batched 2 TR rule       1.7248        2.0206        [illegible]
Simple 3 MTF rule       1.7180        2.0253        2.3158
Simple 3 TR rule        [illegible]   [illegible]   [illegible]
batched 3 MTF rule      1.7004        2.0071        2.2977
batched 3 TR rule       1.6850        1.9768        [illegible]

TABLE 1.1b


The asymptotic search length for a list of n elements whose
retrieval probabilities are given by Zipf's distribution, in
terms of the optimal search cost.


                        number of elements

Heuristic               3             4             5

Simple MTF rule         1.1333        1.1672        1.1921
Simple TR rule          1.1111        1.1142        1.1098
Simple 2 MTF rule       1.0849        1.0984        1.1064
Simple 2 TR rule        1.0642        [illegible]   1.0562
batched 2 MTF rule      1.0726        1.0855        1.0938
batched 2 TR rule       1.0540        1.0524        1.0499
Simple 3 MTF rule       1.0499        1.0548        1.0575
Simple 3 TR rule        [illegible]   1.0349        [illegible]
batched 3 MTF rule      1.0391        1.0454        1.0493
batched 3 TR rule       1.0297        1.0296        1.0285
TABLE 1.2

The overwork for a list of n elements whose retrieval
probabilities are given by Zipf's distribution.


                        number of elements

Heuristic               3             4             5

Simple MTF              0.2006        0.4463        0.7978
Simple TR               0.4580        [illegible]   [illegible]
Simple 2 MTF            [illegible]   [illegible]   5.7746
Simple 2 TR             1.9893        6.2007        12.836
batched 2 MTF           1.6454        4.4474        9.3987
batched 2 TR            2.0169        10.2260       [illegible]
Simple 3 MTF            [illegible]   [illegible]   [illegible]
Simple 3 TR             [illegible]   [illegible]   [illegible]
batched 3 MTF           7.5675        27.6227       75.0761
batched 3 TR            [illegible]   [illegible]   [illegible]

TABLE 1.3a

The number of requests required for simple/batched
transposition rule to have lower cost than the corresponding
simple/batched move-to-front rule, given that retrieval
probabilities of the list of n elements satisfy Zipf's
distribution.


                        number of elements

Heuristic               3             4             5             6

Simple TR               6             10            [illegible]   20
Simple 2 TR             25            [illegible]   94            161
batched 2 TR            36            78            156           280
Simple 3 TR             83            248           658           1471
batched 3 TR            165           597           1740          4050

TABLE 1.3b


The number of requests required for simple/batched k rule
(MTF or TR) to have lower cost than the corresponding
simple/batched k-1 rule, assuming Zipf's distribution for
the retrieval probabilities.


                        number of elements

Heuristic               3             4             5             6

Simple 2 MTF            9             14            20            26
Simple 3 MTF            38            83            167           308
Simple 2 TR             [illegible]   46            104           [illegible]
Simple 3 TR             74            309           1024          2763
batched 2 MTF           [illegible]   19            28            38
batched 3 MTF           83            [illegible]   475           918
batched 2 TR            24            62            [illegible]   336
batched 3 TR            168           824           [illegible]   8010
II. Self-Organizing Linear Search Systems with Limited Buffers


So far, the study of self-organizing linear search has
been limited to the sequential accessing of the same fixed
number of records. However, this may not be a realistic
representation of a sequentially accessed system. More
often, one would not be dealing with the same list of
records; additional records may be added to the system, or
existing records may be deleted.


Here, we shall relax the assumption that the same
records are to be sequentially accessed. We consider the
following scheme:

   Given a set of n records, and a linear list (buffer) of
   size m, where $m \le n$. Only m of the n records can be
   placed in the linear list at one time, according to one
   of the $_nP_m$ possible orderings. At each instant of
   time, one of the n records is requested. To retrieve the
   record, the list is examined sequentially, starting from
   the first position, until the required record is found
   or the end of the list is reached. If the requested
   record is not in the list, this record will be recalled
   from some auxiliary storage and be placed in the list.
   One of the records in the list will have to be dropped
   to make room for this new record, according to some
   predefined replacement rule.


Here, the performance measures of interest are the average
search length and the average number of fetches from the
auxiliary storage.


A random arrangement would require, on the average,
(n-m)/n fetches from auxiliary storage, and

\[
\frac{1}{n} + \frac{2}{n} + \dots + \frac{m}{n} + \frac{m(n-m)}{n} = \frac{m + 2mn - m^2}{2n}
\]

searches through the buffer.

With some slight modifications, the conventional
self-organizing heuristics (MTF or TR scheme) could be
adapted to improve the performance. The modifications need
to be made in order to accommodate the replacement rule. The
schemes described below are basically the same as the
conventional MTF and TR schemes, except when a requested
record does not reside in the list. We shall adopt the same
notation as before. Let $\tau_i(j)$ denote the new position
of the record in position j, $1 \le j \le m$, after the
record in position i is requested. For records that are not
in the list, assume that their position is "external".


Modified Move to Front Heuristic 
i.e., a requested record is always moved to position 1. If
this requested record was originally in the array, say
position i, then records in positions 1 through i-1 are
moved back one position. Otherwise, all records in the array
are moved back one position, forcing the last record to be
dropped from the list.


Modified Transposition Heuristic 
For $1 \le i \le m$,

\[
\tau_i(j) =
\begin{cases}
i-1, & j = i,\ i > 1 \\
i,   & j = i-1 \\
j,   & \text{otherwise (including records in external storage)}
\end{cases}
\]

For i = external,

\[
\tau_i(j) =
\begin{cases}
m, & \text{for the requested record in external storage} \\
\text{external}, & j = m \\
j, & \text{otherwise}
\end{cases}
\]

i.e., if a record does not reside in the list when
requested, it will replace the last record in the array.
Otherwise, it will be exchanged with the record immediately
preceding it.
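
The two modified rules can be sketched as operations on a bounded Python list. This is only an illustration: the function names, the return convention `(new_buffer, comparisons, fetched)`, and the handling of a buffer that is not yet full are assumptions, not part of the scheme as defined above.

```python
def modified_mtf(buffer, rec, m):
    """Modified move to front on a buffer of capacity m.

    Returns (new_buffer, comparisons, fetched_from_auxiliary_storage).
    """
    if rec in buffer:
        i = buffer.index(rec)
        return [rec] + buffer[:i] + buffer[i + 1:], i + 1, False
    # Miss: the whole buffer was scanned; the last record is dropped if full.
    return ([rec] + buffer)[:m], len(buffer), True

def modified_tr(buffer, rec, m):
    """Modified transposition on a buffer of capacity m.

    Returns (new_buffer, comparisons, fetched_from_auxiliary_storage).
    """
    out = list(buffer)
    if rec in out:
        i = out.index(rec)
        if i > 0:
            out[i - 1], out[i] = out[i], out[i - 1]
        return out, i + 1, False
    # Miss: the fetched record replaces the last record (or fills a free slot).
    if len(out) < m:
        out.append(rec)
    else:
        out[-1] = rec
    return out, len(buffer), True

buf = ["a", "b", "c"]
mtf_miss = modified_mtf(buf, "d", 3)
mtf_hit = modified_mtf(buf, "b", 3)
tr_miss = modified_tr(buf, "d", 3)
tr_hit = modified_tr(buf, "c", 3)
```

On a miss both rules scan all m positions and fetch from auxiliary storage; they differ only in where the fetched record lands and which record is displaced.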

These schemata may not be truly representative of
systems involving insertions and deletions. However, the
schemata are important for the following reasons. First, a
study of the performance with various buffer sizes
approximates an analysis of the performance with varying
amounts of insertions and deletions, since a larger buffer
size represents less insertion and deletion activity.
Secondly, these schemata can be applied to the construction
of paging algorithms for storage management. The Least
Recently Used (LRU) strategy used in paging algorithms
corresponds to the modified MTF heuristic here.
To study the performance of this system, we shall
assume that each record has an independent probability $p_i$
of being selected. The analysis of the performance involves
modelling each of the $_nP_m$ possible arrangements of the
buffer as a state in a Markov chain. The one-step transition
matrix is given by $[P_{ij}]$, which is determined by the
$p_i$'s. An account of the Markov chain is given in Kemeny
and Snell[9]. We shall be concerned with Markov chains that
allow transitions from one state to any other state. Thus,
the finite Markov chain is irreducible¹ and aperiodic². This
Markov chain is ergodic³ and the limiting probabilities for
each state,

\[
\pi_j = \lim_{t \to \infty} P_{ij}^{(t)}, \qquad \pi_j > 0
\]

always exist and are independent of the initial probability
distribution. The limiting distribution
$\pi = (\pi_1, \pi_2, \dots)$ is given by the unique solution
to the set of equations

\[
\sum_j \pi_j = 1, \qquad \pi_j = \sum_i \pi_i P_{ij}, \quad j = 1, 2, \dots
\]

Let p(i,y) denote the long term probability that record
$R_i$ is in position y, and $z_i$ the probability that $R_i$
is not in the buffer.

¹ A chain is said to be irreducible if every state is
reachable from every other state.

² For every state i in an aperiodic chain, we can find two
positive integers k1 and k2 such that the (k1)-step and
(k2)-step transition probabilities satisfy
$P_{ii}^{(k1)} > 0$ and $P_{ii}^{(k2)} > 0$, and such that k1
and k2 have no common divisors other than 1.

³ A finite-state Markov chain which is irreducible and
aperiodic is ergodic.
\[
p(i,y) = \sum \pi_k \quad (\text{summation over all those } \pi_k \text{ such that } R_i \text{ is in position } y)
\]

\[
z_i = 1 - \sum_{y=1}^{m} p(i,y)
\]

The expected search length of a request, starting from
position 1, is given by

\[
\mu = \sum_{i=1}^{n} p_i \mu_i, \quad \text{where} \quad
\mu_i = \sum_{y=1}^{m} y \, p(i,y) + m z_i
\]

$\mu_i$ is the expected search length for record $R_i$.


The average number of fetches from the auxiliary
storage when a record is requested is given by

\[
\nu = \sum_{i=1}^{n} p_i z_i
\]

Assume that $p_1 \ge p_2 \ge \dots \ge p_n$. Then the optimal
ordering would be to place the m most frequently accessed
records in the buffer, according to the order
$R_1 R_2 \dots R_m$. The optimal expected search cost,
$\mu(\mathrm{OPT})$, and the optimal number of fetches,
$\nu(\mathrm{OPT})$, are given as:

\[
\mu(\mathrm{OPT}) = \sum_{i=1}^{m} i \, p_i + m \sum_{i=m+1}^{n} p_i,
\qquad
\nu(\mathrm{OPT}) = \sum_{i=m+1}^{n} p_i
\]
A. Simple MTF with Limited Buffer Size


The modified-MTF scheme corresponds to the LRU
independent reference model (see Coffman and Denning[3]).
Let $\Omega_m$ be the set of all possible configurations of
the m-size buffer, and $\pi_k$ be a particular state,
$\pi_k \in \Omega_m$. Let
$\pi_k = [R_{i1} R_{i2} \dots R_{im}]$, where
$1 \le i_j \le n$ and $R_{ij} \ne R_{ik}$ for $j \ne k$. For
ease of reference, those records that are not in the buffer
are labelled as $R_{i(m+1)}$ through $R_{in}$. Given that
each record $R_{ij}$ has a probability $p_{ij}$ of being
requested, the stationary distribution of the state $\pi_k$
is given by the following theorem.

THEOREM 2.1:

\[
\mathrm{Prob}[\pi_k] = \mathrm{Prob}[R_{i1} R_{i2} \dots R_{im}]
= \prod_{j=1}^{m} \frac{p_{ij}}{\sum_{t=j}^{n} p_{it}}
\]
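
The stationary distribution can be checked numerically on a small instance. The sketch below assumes the product form $\prod_j p_{ij} / (1 - \sum_{t<j} p_{it})$, which is how I read Theorem 2.1 and which matches the form usually quoted for LRU under the independent reference model. The probabilities, buffer size and function names are illustrative assumptions.

```python
from itertools import permutations

def product_form(state, p):
    """Assumed stationary probability of a buffer state (front of list first)."""
    prob, used = 1.0, 0.0
    for rec in state:
        prob *= p[rec] / (1.0 - used)
        used += p[rec]
    return prob

def stationary_by_iteration(p, m, steps=500):
    """Iterate the modified-MTF chain on m-size buffers from a uniform start."""
    n = len(p)
    states = list(permutations(range(n), m))
    dist = {s: 1.0 / len(states) for s in states}
    for _ in range(steps):
        new = dict.fromkeys(states, 0.0)
        for s, pr in dist.items():
            for rec in range(n):
                rest = tuple(r for r in s if r != rec)
                nxt = ((rec,) + rest)[:m]   # requested record moves to the front
                new[nxt] += pr * p[rec]
        dist = new
    return dist

p = [0.4, 0.3, 0.2, 0.1]   # hypothetical request probabilities
m = 2
dist = stationary_by_iteration(p, m)
max_gap = max(abs(dist[s] - product_form(s, p)) for s in dist)
front = [sum(pr for s, pr in dist.items() if s[0] == i) for i in range(len(p))]
```

In this example the iterated distribution agrees with the product form to within floating-point error, and the probability that record i heads the buffer equals p[i].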


Before proceeding to the proof, we shall present the
following lemmas which indicate the two properties of the
above expression for $\pi_k$.
Define


Ce 2 Seon 


and, 




LEMMA 2.1a:
ProblRie Rumsey Rist Ricys 1) eck moO. a= oy bees -y Sm 
PROOF: 
Case 2: y = m-1

Define $\mathrm{Prob}_m(R_{i1} R_{i2} \dots R_{im})$ to be the
stationary probability of the state where records
$R_{i1}, R_{i2}, \dots, R_{im}$ are arranged accordingly, and
the size of the buffer is m. Again, records that are not in
the buffer are labelled $R_{i(m+1)}$ through $R_{in}$.

This lemma indicates that a set of m records,
$R_{i1} R_{i2} \dots R_{im}$, has the same probability of
occupying the first m positions in an (m+1)-size buffer as
in an m-size buffer.
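\[
\sum_{x=m+1}^{n} \mathrm{Prob}_{m+1}(R_{i1} R_{i2} \dots R_{im} R_{ix})
= \prod_{j=1}^{m} \frac{p_{ij}}{\sum_{t=j}^{n} p_{it}}
  \cdot \frac{\sum_{x=m+1}^{n} p_{ix}}{\sum_{t=m+1}^{n} p_{it}}
= \mathrm{Prob}_m(R_{i1} R_{i2} \dots R_{im}) \qquad \blacksquare
\]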


To visualize the above lemma, one could also think of
the n records as being arranged in an array of n entries,
reorganized according to the rule of the conventional MTF
scheme. Suppose we are interested in an m-size buffer.
Records that occupy positions m+1 through n are to be
thought of as residing in the auxiliary storage, so that a
request for one of these records will necessitate m searches
through the buffer and a fetch from the auxiliary storage.
In other words, positions m+1 through n should be treated as
the same position, namely "external". Notice that the
modified-MTF scheme is satisfied: a requested record will
always be moved to the top of the list, and requesting a
record anywhere between m+1 and n (external) will force the
record in position m to be moved to m+1 (i.e. external).
Therefore, the stationary distribution of a configuration in
a linear search system with an m-size buffer is given by the
summation of all states in the conventional MTF n-element
configuration that have the same sequence of m elements
appearing in front.
PROOF OF THEOREM 2.1


The equation given in Theorem 2.1 must satisfy the following
expressions

Case 1 : to show that $\mathrm{Prob}[\pi_k]$ satisfies (i)
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Problihws : ack; PRR pepanes te Rim. 
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j=2 x=m+ 1 
Case 2 : to show that $\sum_k \mathrm{Prob}[\pi_k] = 1$


First, we shall show that, asymptotically, record $R_i$ has a
probability $p_i$ of being in position 1, for all buffer
sizes. In other words, summation of stationary distributions
for all configurations with record $R_i$ in position 1 is
equal to $p_i$.

PROOF by induction:


True for buffer size m = 1,

\[
\mathrm{Prob}[R_i] = \frac{p_i}{\sum_{t=1}^{n} p_t} = p_i
\]
Suppose this is true for m=k.
There are ${}_{n-1}P_{k-1}$ possible configurations that have
record $R_i$ in the first position. Let one of these
configurations be denoted by
$\pi_i(k) = [R_i R_{j_2} \ldots R_{j_k}]$, where
$1 \le j_x \le n$ and $j_x \ne i$.
Then,

\[ \sum \mathrm{Prob}[\pi_i(k)] = p_i . \]

For m=k+1, there are ${}_{n-1}P_{k}$ possible configurations that
have record $R_i$ in the first position. Divide these
${}_{n-1}P_{k}$ possible configurations into ${}_{n-1}P_{k-1}$
groups of (n-k) configurations each. Within each group, the same
k records appear in the first k positions and in the same order.
Denote one of these groups by $\pi_{i'}(k+1)$, where $\pi_{i'}$
is the set
$\{[R_i R_{j_2} \ldots R_{j_k} R_x] : x \ne i, j_2, \ldots, j_k\}$.
For each $\pi_{i'}$, there is a corresponding $\pi_i$ in
$\Omega_k$ such that the same k elements appear in the first k
positions.
$= \mathrm{Prob}[R_i R_{j_2} \ldots R_{j_k}]$ (by Lemma 2.1b)
Therefore,

\[ \sum \mathrm{Prob}[\pi_{i'}(k+1)] = \sum \mathrm{Prob}[\pi_i(k)] = p_i . \]
By induction, each record R, has a probability p; of being 
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The next two lemmas apply when Lemma 2.1b is satisfied. 
LEMMA 2.2: 


\[ \sum{}' \, \mathrm{Prob}[\pi_i(m)] = \mathrm{Prob}[R_i], \qquad 1 \le m \le n . \]
$\sum{}'$ refers to the summation of states in $\Omega_m$ that
have record $R_i$ in the first position. This lemma states that
the probability of having record $R_i$ in position 1, for all
buffer sizes, is the same as the probability of having $R_i$ in
a single-size buffer.
PROOF :

Follows directly from the proof of Theorem 2.1.
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number of fetches from auxiliary storage. Then 


LEMMA 2.3
\[ \mu(m+1) = \mu(m) + \nu(m), \qquad 0 \le m \le n-1 . \]
PROOF
Case 1 : m = 0
Obviously, $\mu(0) = 0$ and $\nu(0) = 1$.
When the buffer size is one, one search through the buffer
is needed to determine if the requested record is present or
not. Hence, $\mu(1) = 1$. Therefore,
\[ \mu(1) = \mu(0) + \nu(0) . \]
Case 2 : m > 0
Let $\pi_i(m)$ be a particular configuration of a buffer of size
Following the same argument as in the proof of Theorem 2.1, we
could again group the ${}_{n}P_{m+1}$ possible configurations of
an (m+1)-size buffer into ${}_{n}P_{m}$ groups, where all
configurations in the group have the same m elements appearing
in the first m positions. Denote this group by
$\pi_{i'}(m+1)$, where $\pi_{i'}$ is the set
$\{[R_{i_1} R_{i_2} \ldots R_{i_m} R_x]\}$. For each $\pi_{i'}$,
we could find a corresponding $\pi_i$ in $\Omega_m$ such that the
first m elements are the same.
(by Lemma 2.1b)


Theorem 2.1 allows us to write a rather complicated
expression for the performance of the modified scheme using
the move to front heuristics. The retrieval cost, in terms
of expected number of searches through the buffer and
expected number of auxiliary memory fetches, is given in the
next theorem.


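THEOREM 2.2
Under the modified move to front heuristics, the expected
number of auxiliary memory fetches for a linear search
system with n elements and m buffers is given by
\[ \nu(m) = \sum_{\pi_i \in \Omega_m} \Bigl[ 1 - \sum_{j=1}^{m} p_{i_j} \Bigr] \mathrm{Prob}[\pi_i], \]
and the expected number of buffer lookups is given by
\[ \mu(m) = 1 + \sum_{k=1}^{m-1} \sum_{\pi_i \in \Omega_k} \Bigl[ 1 - \sum_{j=1}^{k} p_{i_j} \Bigr] \mathrm{Prob}[\pi_i], \]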
PROOF:
\[ \nu(m) = \sum_{i=1}^{n} p_i z_i , \]
where $z_i$ is the probability that $R_i$ is not in the buffer,
\[ \nu(m) = \sum_{\pi_i \in \Omega_m} \Bigl[ 1 - \sum_{j=1}^{m} p_{i_j} \Bigr] \mathrm{Prob}(\pi_i) . \]

By Lemma 2.3,
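
The two cost expressions of Theorem 2.2 can be checked numerically on a small example. The sketch below is illustrative only: it assumes that the stationary distribution of Theorem 2.1 has the standard move to front product form (the first m positions are filled by drawing records without replacement according to their request probabilities), which is not restated in this part of the text. The function names are hypothetical.

```python
from itertools import permutations

def mtf_state_prob(state, p):
    """Assumed product form: probability that the buffer holds `state`, front first."""
    prob, used = 1.0, 0.0
    for r in state:
        prob *= p[r] / (1.0 - used)  # r is requested most recently among the remaining records
        used += p[r]
    return prob

def fetch_cost(p, m):
    """nu(m): sum over buffer states of Prob[state] * P(request lies outside the buffer)."""
    return sum((1.0 - sum(p[r] for r in s)) * mtf_state_prob(s, p)
               for s in permutations(range(len(p)), m))

def lookup_cost(p, m):
    """mu(m) = 1 + sum_{k=1}^{m-1} nu(k), for m >= 1 (the form given in Theorem 2.2)."""
    return 1.0 + sum(fetch_cost(p, k) for k in range(1, m))

def lookup_cost_direct(p, m):
    """mu(m) by enumeration: a hit in position j costs j lookups, a miss costs m."""
    n = len(p)
    total = 0.0
    for s in permutations(range(n), m):
        cost = sum(p[r] * (s.index(r) + 1 if r in s else m) for r in range(n))
        total += cost * mtf_state_prob(s, p)
    return total

if __name__ == "__main__":
    p = [6 / 11, 3 / 11, 2 / 11]  # Zipf probabilities for n = 3
    for m in (1, 2, 3):
        print(m, fetch_cost(p, m), lookup_cost(p, m), lookup_cost_direct(p, m))
```

Under this assumption the recursive form and the direct enumeration of the lookup cost agree, which is the content of Lemma 2.3.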
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B. Simple TR with Limited Buffer Size 


Similarly, this transposition rule can be modelled as a
Markov chain. Let $\pi_i$ be a particular state,
$\pi_i = [R_{i_1} R_{i_2} \ldots R_{i_m}]$. The equations below
describe the states of the Markov chain.
Unlike MTF, the asymptotic distribution of the states
in the above Markov chain is difficult to obtain. We fail to
obtain a general expression for the retrieval cost of the
modified scheme using the transposition rule. Although the
exact retrieval cost cannot be determined here, observations
suggest that, asymptotically, the transposition rule should
be at least as efficient as the move to front heuristic, in
terms of both the expected number of buffer lookups and the
expected number of fetches.
record $R_j$ precedes record $R_i$ for the conventional move to
front and the transposition heuristic respectively.
Conventional refers to the self-organizing system where the
buffer size is as big as the number of records. Rivest [13]
shows that $b'(i,j) \ge b(i,j)$ when $p_j \ge p_i$. This
observation suggests that under the transposition rule, records
have a better chance of being arranged in a near optimal order
than under the move to front rule. Let $A_i$, $i = 1, \ldots, n-1$,
denote the spectrum of self-organizing heuristics in the
conventional system. Under the $A_i$ heuristic, every time a
positions less than i). $A_1$ corresponds to the transposition
rule, while $A_{n-1}$ corresponds to the move to front rule.
Bitner conjectures that the average search time for $A_i$ is
asymptotically better than that of $A_{i+1}$. With these previous
observations in mind, we shall now proceed to examine the
performance of the modified scheme under the transposition
Another way of looking at the modified transposition
rule would be to treat the records as being arranged in an
array with n entries. To study the behavior of the rule when
the buffer size is m, positions m+1 through n should be
treated as "external". When a record between position 1 and
m is requested, it will be exchanged with the record
preceding it. However, requesting a record that resides in a
position between m+1 and n would bring forward the record to
position m, and hence, one of the following heuristics, $A_1$,
$A_2$, ..., $A_{n-m}$, is used when an "external" record is
required. This observation suggests that the performance of the
modified TR rule should acquire a characteristic that is a
combination of those of the $A_1$, $A_2$, ..., $A_{n-m}$
heuristics. Also note that when the buffer size is 1, the
modified TR rule is exactly the same as the corresponding
modified move to front rule. When the buffer size is greater
than 1, the modified TR rule would have a higher probability of
achieving a near optimal ordering than the corresponding move to
front rule, since a combination of the $A_1$, ..., $A_{n-m}$
rules should perform better than the $A_{n-1}$ (MTF) rule alone.
With this observation, it is suggested that the transposition
rule should be at least as efficient asymptotically as the
corresponding move to front heuristic, for all buffer sizes.
Unfortunately, we have not been able to provide a proof of
this conjecture.
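
The array view described above translates directly into a small simulation. The following sketch is an illustration, not the simulation program of Appendix D: the function name `simulate`, the initial buffer contents, the request count and the seed are all assumptions. A hit in position j is charged j lookups and a miss is charged m lookups plus one fetch.

```python
import random

def simulate(p, m, rule, requests=100_000, seed=0):
    """Estimate (lookups, fetches) per request for the modified 'mtf' or 'tr' rule."""
    rng = random.Random(seed)
    n = len(p)
    buf = list(range(m))            # assumed start: records 0..m-1 are in the buffer
    lookups = fetches = 0
    for r in rng.choices(range(n), weights=p, k=requests):
        if r in buf:
            j = buf.index(r)
            lookups += j + 1
            if rule == "mtf":
                buf.insert(0, buf.pop(j))            # move the record to the front
            elif j > 0:
                buf[j - 1], buf[j] = buf[j], buf[j - 1]  # exchange with its predecessor
        else:
            lookups += m            # the whole buffer is searched before the fetch
            fetches += 1
            if rule == "mtf":
                buf.pop()           # record in position m becomes external
                buf.insert(0, r)
            else:
                buf[-1] = r         # external record is brought forward to position m
    return lookups / requests, fetches / requests

if __name__ == "__main__":
    p = [6 / 11, 3 / 11, 2 / 11]    # Zipf probabilities for n = 3
    for m in (1, 2):
        print(m, simulate(p, m, "mtf"), simulate(p, m, "tr"))
```

With a buffer of size 1 the two rules produce identical runs, in line with the remark that the modified TR rule then coincides with the modified move to front rule.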
Tables 2.2a, 2.2b, 2.4a and 2.4b show the performances
of the modified system under move to front and transposition
rule, for a list of n elements whose retrieval probabilities
are given by the Zipf and Wedge distributions respectively.
The wedge distribution is closer to uniformity,
$p_i = 2(n-i+1)/(n(n+1))$. The retrieval cost for the move to
front rule is directly calculated from Theorem 2.2, while a good
approximation of the cost for the transposition rule is
obtained through calculating the probabilities of each state
at successive times t = 0, 1, 2, ..., in a similar manner as in
Chapter 1. Appendix D further describes this approximation,
and includes some simulation studies on these modified
heuristics.
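
The approximation for the transposition rule can be sketched as repeated propagation of the state probabilities. The code below is a minimal illustration under stated assumptions: the Zipf probabilities are taken as $p_i = 1/(i H_n)$, the Wedge probabilities as $p_i = 2(n-i+1)/(n(n+1))$, the starting distribution is uniform over all buffer states, and the number of steps is an arbitrary choice. The function names are hypothetical and the procedure of Appendix D may differ in detail.

```python
from itertools import permutations

def zipf(n):
    h = sum(1.0 / i for i in range(1, n + 1))       # harmonic number H_n
    return [1.0 / (i * h) for i in range(1, n + 1)]

def wedge(n):
    return [2.0 * (n - i + 1) / (n * (n + 1)) for i in range(1, n + 1)]

def tr_step(state, r):
    """Next buffer state under the modified transposition rule when r is requested."""
    s = list(state)
    if r in s:
        j = s.index(r)
        if j > 0:
            s[j - 1], s[j] = s[j], s[j - 1]
    else:
        s[-1] = r                                    # external record enters at position m
    return tuple(s)

def tr_costs(p, m, steps=2000):
    """Approximate (mu, nu) by propagating state probabilities for `steps` requests."""
    n = len(p)
    states = list(permutations(range(n), m))
    dist = {s: 1.0 / len(states) for s in states}    # assumed uniform start at t = 0
    for _ in range(steps):
        nxt = dict.fromkeys(states, 0.0)
        for s, w in dist.items():
            for r in range(n):
                nxt[tr_step(s, r)] += w * p[r]
        dist = nxt
    nu = sum(w * (1.0 - sum(p[r] for r in s)) for s, w in dist.items())
    mu = sum(w * sum(p[r] * (s.index(r) + 1 if r in s else m) for r in range(n))
             for s, w in dist.items())
    return mu, nu

if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, tr_costs(zipf(4), m), tr_costs(wedge(4), m))
```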


From Tables 2.2a, 2.2b, 2.4a and 2.4b, the results
obtained do serve as numerical evidence for the suggestion
that the transposition rule is more efficient than the
corresponding move to front rule. The simulation results in
Appendix D also support this conjecture. However, as
mentioned before, the validity of this claim remains to be
proven.
TABLE 2.1
The optimal retrieval cost for a list of n records with m
buffers, whose retrieval probabilities are given by Zipf's
distribution


number of    retrieval probability of record
records      (given by Zipf's distribution)
             1       2       3       4       5       6
   3         0.5455  0.2727  0.1818
   4         0.4800  0.2400  0.1600  0.1200
   6         0.4082  0.2041  0.1361  0.1020  0.0816  0.0680


The cost is in terms of the number of buffer lookups, $\mu$, and
the expected number of auxiliary memory fetches, $\nu$.
TABLE 2.2a


The asymptotic retrieval cost for a list of n records with m
buffers, whose retrieval probabilities are given by Zipf's
distribution. The cost is in terms of $\mu$, the expected number
of searches through the buffer, and $\nu$, the expected number
of fetches from the auxiliary storage.

number of    buffer    asymptotic retrieval cost
records      size      Move To Front rule    Transposition rule
TABLE 2.2b
(ratio of retrieval to optimal cost)


The asymptotic retrieval cost for a list of n records with m
buffers, whose retrieval probabilities are given by Zipf's
distribution. The cost is in terms of the optimal number of
buffer lookups and the optimal number of auxiliary memory
fetches.


number of    buffer    asymptotic retrieval cost
records      size      Move To Front rule    Transposition rule
TABLE 2.3
The optimal retrieval cost for a list of n records with m
buffers, whose retrieval probabilities are given by the
Wedge distribution


number of    retrieval probability of record
records      (given by Wedge distribution)

The cost is in terms of the number of buffer lookups, $\mu$, and
the expected number of auxiliary memory fetches, $\nu$.
TABLE 2.4a


The asymptotic retrieval cost for a list of n records with m
buffers, whose retrieval probabilities are given by the
Wedge distribution. The cost is in terms of $\mu$, the expected
number of buffer lookups, and $\nu$, the expected number of
auxiliary memory fetches.

number of    buffer    asymptotic retrieval cost
records      size      Move To Front rule    Transposition rule
TABLE 2.4b
(ratio of retrieval to optimal cost)


The asymptotic retrieval cost for a list of n records with m
buffers, whose retrieval probabilities are given by the
Wedge distribution. The cost is in terms of the optimal
number of buffer lookups and optimal number of auxiliary
memory fetches.


number of    buffer    asymptotic retrieval cost
records      size      Move To Front rule    Transposition rule
III. K in a Row Heuristics for Linear Search Systems with
Limited Buffer Size


The notion of the "k in a row" heuristic could again be
introduced to the memory-free self-organizing rules for the
scheme with limited buffers. The approach is similar to the
one discussed in the conventional linear search system,
i.e., the modified MTF (or TR) rule is applied only when the
same element is accessed k times in a row. For this class of
heuristics, log2(nk) extra bits of storage are needed for a
system with n elements.


In this chapter, the two k in a row heuristics, simple 
and batched k, are studied. We shall show that, the 
self-organizing heuristics for the linear search system with 
limited buffer size can perform more efficiently 
asymptotically, with the introduction of a few extra bits of 
storage. The asymptotic retrieval cost, in terms of both the 
expected number of buffer lookups and the expected number of 
auxiliary memory fetches, can be reduced when a few previous 


requests are "remembered". 


The performance of the k in a row heuristics is again
analyzed by modelling each of the ${}_{n}P_{m}$ possible
configurations of the m buffers as a state in the Markov
chain. The class of move to front heuristics is studied in
detail. While no general expression can be obtained for the
performance of the transposition rule, numerical examples
are provided to demonstrate the efficiency of the




transposition rule over the corresponding move to front
rule.


Once again, we are looking at a self-organizing linear
search system with n records and m buffers, where each
record, $R_i$, has an independent probability $p_i$ of being
accessed.
A. Simple K Heuristic with Limited Buffer Size 

The modified MTF (or TR) heuristic is applied only if
the same element is accessed k times in a row.
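
As a rough illustration of the rule just stated, the sketch below applies the modified move to front step only once the current request completes a run of k consecutive requests for the same record. The function name, the `initial` ordering and the returned trace are assumptions made for the example. The only state kept beyond the buffer is the last requested record and a run counter, which is where the log2(n) + log2(k) extra bits come from.

```python
def simple_k_mtf(requests, m, k, initial):
    """Return the buffer contents after each request under the simple k MTF rule."""
    buf = list(initial)[:m]
    last, run = None, 0              # the extra memory: last record and run length
    trace = []
    for r in requests:
        run = run + 1 if r == last else 1
        last = r
        if run >= k:                 # same record requested k times in a row
            if r in buf:
                buf.remove(r)
            else:
                buf.pop()            # record in position m becomes external
            buf.insert(0, r)         # repeating the move is harmless: r is already in front
        trace.append(tuple(buf))
    return trace

if __name__ == "__main__":
    print(simple_k_mtf([2, 2, 0, 1, 1, 1], m=2, k=2, initial=[0, 1, 2]))
```

Setting k = 1 gives back the plain modified move to front rule, since every request then completes a run of length one.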


Modified Simple K MTF 


We first analyze the simple k heuristic with the
modified move to front rule. The performance of this
heuristic is analyzed in a manner analogous to that for the
modified MTF heuristic. Let $\pi_i$ be a particular state of the
Markov chain, $\pi_i = [R_{i_1} R_{i_2} \ldots R_{i_m}]$, and let
those records that are not in the buffer for this
configuration be labelled as $R_{i_{m+1}}$ through $R_{i_n}$. Each
record $R_{i_j}$ has a corresponding probability $p_{i_j}$ of being
accessed.
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The equations below describe the Markov chain : 
THEOREM 3.1

Under the modified simple k MTF heuristic, the stationary
distribution of the state $\pi_i$ is given by

$\mathrm{Prob}[\pi_i] = \mathrm{Prob}[R_{i_1} R_{i_2} \ldots R_{i_m}]$
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we wo Sie 
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Epi ee Leek ey) 1 | 
ae tr 


PROOF: See Appendix B. 


In Appendix B, we show that 
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as in Lemma 2.2. 
auxiliary memory fetches is given by
\[ \nu(m) = \sum_{\pi_i \in \Omega_m} \Bigl[ 1 - \sum_{j=1}^{m} p_{i_j} \Bigr] \mathrm{Prob}[\pi_i], \]
and the expected number of buffer lookups is given by
\[ \mu(m) = 1 + \sum_{k=1}^{m-1} \sum_{\pi_i \in \Omega_k} \Bigl[ 1 - \sum_{j=1}^{k} p_{i_j} \Bigr] \mathrm{Prob}[\pi_i], \]
where $\Omega_m$ is the set of all configurations of the buffer
when the buffer size is m, and $\pi_i \in \Omega_m$ is a
particular state, $\pi_i = [R_{i_1} R_{i_2} \ldots R_{i_m}]$.
$\mathrm{Prob}[\pi_i]$ is given by Theorem 3.1.
PROOF:

This is equivalent to Theorem 2.2.
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number of buffer lookups as well. 
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(See Appendix A) 


Without loss of generality, assume the records are labelled
in non-increasing order of retrieval probability. From
Equation (3.2), it is obvious that the smallest value for
$\nu(m)$ is $\sum_{i=m+1}^{n} p_i$, and this is achieved when the
(n-m) smallest records are not in the buffer.
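
The optimal costs used as the baseline in the ratio tables can be computed directly. In the sketch below the fetch cost is the bound stated above. The lookup cost is an assumption that follows the counting convention of Lemma 2.3: a hit in position j costs j lookups, a miss costs m lookups, and the buffer holds the m most probable records in non-increasing order. The function name is hypothetical.

```python
def optimal_costs(p, m):
    """Return (mu, nu) for a fixed buffer holding the m most probable records."""
    q = sorted(p, reverse=True)                  # non-increasing retrieval probability
    nu = sum(q[m:])                              # the (n-m) least probable records are external
    mu = sum(min(j, m) * pj for j, pj in enumerate(q, start=1))
    return mu, nu

if __name__ == "__main__":
    p = [6 / 11, 3 / 11, 2 / 11]                 # Zipf probabilities for n = 3
    for m in range(4):
        print(m, optimal_costs(p, m))
```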


We have shown earlier that for the class of move to
front heuristics, the n records can be treated as being
arranged in a list, and reorganizing occurs in the same way
as in the corresponding conventional system. When the buffer
size of interest is m, positions m+1 through n would be
treated as "external", and the distribution of the
configuration could be preserved. Tables 3.1 and 3.2 below
show some values of $p_k(i,j)$, i.e., the probability of record
$R_i$ in position j, under the simple k heuristics for the
conventional system, where the buffer size is as big as the
number of records. When the buffer size is m, $p_k(i,j)$ is the
same as in the conventional system, for j = 1, ..., m. The
probability of record $R_i$ not being in the buffer is then
given by the summation of $p_k(i,j)$ in the conventional system,
for j ranging from m+1 to n. Tables 3.1 and 3.2 illustrate
how $p_k(i,j)$ changes with k. We can see from the two
tables that, as k increases, simple k heuristics tend to
arrange the records more and more closely to the optimal
arrangement, i.e., record $R_i$ has a higher probability of
increases. Previous study by Gonnet, Munro and Suwanda [5]
has shown that, under the conventional simple k heuristics,
$b_{k+1}(i,j) \ge b_k(i,j)$ when $p_j \ge p_i$. $b_k(i,j)$ refers
to the probability of record $R_j$ being in front of $R_i$. This
observation suggests that the simple k heuristic attempts to
rearrange the records more closely to the optimal
arrangement, as k increases.


Unfortunately, no concise general expression is
obtained for $p_k(i,j)$ for this simple k heuristic, and thus
the cost for various k cannot be compared directly. We then
have to examine the retrieval cost for specific
distributions. The retrieval cost for a list of n elements,
assuming the retrieval probabilities satisfy Zipf's
distribution, is calculated according to Theorem 3.2, and is
given in Tables 3.3a and 3.4a. The ratio of the retrieval
cost to the optimal cost is presented in Tables 3.3b and
3.4b respectively. Tables 3.5a, 3.5b, 3.6a and 3.6b
illustrate the retrieval cost for a list of elements
satisfying the Wedge distribution. Numerical examples
presented show that the simple k heuristic has a lower
number of memory fetches than the corresponding k=1
heuristic, for all buffer sizes. Consequently, the expected
number of buffer lookups for the simple k heuristic
decreases as k increases, for all buffer sizes. Thus, at the
expense of a small amount of extra memory, log2(n)+log2(k)
bits to be exact, the modified simple k heuristics can be
asymptotically more efficient than that of the corresponding




simple move to front heuristic.


Modified Simple K TR 


The Markov chain used to model this transposition 


heuristic is given by the following equations : 
No general expression is obtained for the solution of
the above Markov chain, and the exact cost for the
transposition rule is still unknown. We can only
demonstrate, through numerical examples and simulation
results, that the transposition rule can perform better
asymptotically than the corresponding simple k move to front
rule, in terms of both the expected number of buffer lookups
and the expected number of auxiliary memory fetches. A good
approximation for the retrieval cost, for a list of n
elements with m buffers, is obtained in an analogous manner
as in Chapter 2. Tables 3.3a, 3.3b, 3.4a, 3.4b, 3.5a, 3.5b,
3.6a, and 3.6b illustrate some values of the retrieval cost.
Appendix D includes some simulation results.
the simple k MTF heuristic is adopted.
TABLE 3.3a : Simple 2 Heuristic (Zipf's Distribution)


The asymptotic retrieval cost for a list of n elements with
m buffers, whose retrieval probabilities are given by Zipf's
distribution. The cost is in terms of $\mu$, the expected number
of searches through the buffer, and $\nu$, the expected number
of fetches from the auxiliary storage.

number of    buffer    asymptotic retrieval cost
elements     size      Move To Front rule    Transposition rule
TABLE 3.3b : Simple 2 Heuristic (Zipf's Distribution)
(ratio of retrieval to optimal cost)


The asymptotic retrieval cost for a list of n elements with
m buffers, whose retrieval probabilities are given by Zipf's
distribution. The cost is in terms of the optimal number of
buffer lookups and optimal number of auxiliary memory
fetches.
TABLE 3.4a : Simple 3 Heuristic (Zipf's Distribution)


The asymptotic retrieval cost for a list of n elements with
m buffers, whose retrieval probability is given by Zipf's
distribution.
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The asymptotic retrieval cost for a list of n elements with 
m buffers, whose retrieval probability is given by Zipf's 
distribution. The cost is in terms of the optimal number of 
buffer lookups and memory fetches.
TABLE 3.5a : Simple 2 Heuristic (Wedge Distribution)


The asymptotic retrieval cost for a list of n elements with
m buffers, whose retrieval probabilities are given by the
Wedge distribution. The cost is in terms of $\mu$, the expected
number of buffer lookups, and $\nu$, the expected number of
memory fetches.

number of    buffer    asymptotic retrieval cost
elements     size      Move To Front rule    Transposition rule
TABLE 3.5b : Simple 2 Heuristic (Wedge Distribution)
(ratio of retrieval to optimal cost)


The asymptotic retrieval cost for a list of n elements with
m buffers, whose retrieval probabilities are given by the
Wedge distribution. The cost is in terms of the optimal
number of buffer lookups and memory fetches.
TABLE 3.6a : Simple 3 Heuristic (Wedge Distribution)


The asymptotic retrieval cost for a list of n elements with
m buffers, whose retrieval probabilities are given by the
Wedge distribution. The cost is in terms of $\mu$, the expected
number of buffer lookups, and $\nu$, the expected number of
memory fetches.

number of    buffer    asymptotic retrieval cost
elements     size      Move To Front rule    Transposition rule
TABLE 3.6b : Simple 3 Heuristic (Wedge Distribution)
(ratio of retrieval to optimal cost)


The asymptotic retrieval cost for a list of n elements with
m buffers, whose retrieval probabilities are given by the
Wedge distribution. The cost is in terms of the optimal
number of buffer lookups and memory fetches.

number of    buffer    asymptotic retrieval cost
elements     size      Move To Front rule    Transposition rule
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B. Batched K Heuristics with Limited Buffer Size 


Here, requests are grouped into batches of k elements.
The modified heuristic (MTF or TR) is applied only if the k
requests in a batch are for the same record.
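
The difference from the simple k rule is that runs are only recognized inside fixed batches. A minimal sketch, with a hypothetical function name and an assumed `initial` ordering, applying the modified move to front step once per batch:

```python
def batched_k_mtf(requests, m, k, initial):
    """Return the buffer contents after each complete batch of k requests."""
    buf = list(initial)[:m]
    trace = []
    for start in range(0, len(requests) - k + 1, k):
        batch = requests[start:start + k]
        if all(r == batch[0] for r in batch):    # all k requests are for one record
            r = batch[0]
            if r in buf:
                buf.remove(r)
            else:
                buf.pop()                        # record in position m becomes external
            buf.insert(0, r)
        trace.append(tuple(buf))
    return trace

if __name__ == "__main__":
    # The run "2, 2" straddles two batches here, so no move is made.
    print(batched_k_mtf([0, 2, 2, 1], m=2, k=2, initial=[0, 1, 2]))
```

A run of k equal requests that straddles a batch boundary triggers no move here, whereas the simple k rule would act on it.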


Modified Batched K MTF 


The performance of this heuristic is studied in a
manner analogous to that in the previous move to front
heuristics. Each of the ${}_{n}P_{m}$ possible configurations of
the buffer represents a state in the Markov chain which is
described by the following equations
Under the batched k heuristic with modified MTF rule,
the stationary distribution of the state $\pi_i$ is given by

\[ \mathrm{Prob}[\pi_i] = \mathrm{Prob}[R_{i_1} R_{i_2} \ldots R_{i_m}] = \prod_{j=1}^{m} \frac{p_{i_j}^{k}}{\sum_{x=j}^{n} p_{i_x}^{k}} \]




PROOF OF THEOREM 3.3: See Appendix C. 


In Appendix C, we show that

\[ \mathrm{Prob}[R_{i_1} R_{i_2} \ldots R_{i_m}] = \sum_{x=m+1}^{n} \mathrm{Prob}[R_{i_1} R_{i_2} \ldots R_{i_m} R_{i_x}] \]

The above condition allows us to write

\[ \sum{}' \, \mathrm{Prob}[\pi_i(m)] = \mathrm{Prob}[R_i], \qquad 1 \le m \le n \]

as in Lemma 2.2. This condition in fact holds for all the
move to front heuristics discussed so far.


Similarly, the retrieval cost of the batched k MTF
heuristics is given in the same form as in the other move to
front heuristics.


THEOREM 3.4
For a limited-buffer linear search system that adopts the
modified batched k heuristics, the expected number of
auxiliary memory fetches is given by
\[ \nu(m) = \sum_{\pi_i \in \Omega_m} \Bigl[ 1 - \sum_{j=1}^{m} p_{i_j} \Bigr] \mathrm{Prob}[\pi_i], \]
and the expected number of buffer lookups is given by
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PROOF : 


Similar to the proof of Theorem 2.2.


The performance of this heuristic is studied in the
same way as in the simple k heuristic. First of all, we
shall examine the probability of record $R_i$ in position j.
Tables 3.7 and 3.8 below illustrate some values of $p_k(i,j)$,
for various k. It is interesting to note how $p_k(i,j)$
changes with k, and also how it changes from simple k to the
corresponding batched k heuristic. The values of $p_k(i,j)$
presented suggest that this batched k heuristic could
perform more efficiently asymptotically for larger k. As k
increases, each record $R_i$ has a higher probability of being
arranged in its respective optimal position, i. Furthermore,
the values also indicate that this batched k heuristic can be
more efficient than the corresponding simple k heuristic.
Numerical results for this move to front heuristic are
directly calculated according to Theorem 3.4, and are
presented in Tables 3.9a, 3.9b, 3.10a, 3.10b, 3.11a, 3.11b,
3.12a, and 3.12b. The results presented in the tables show
that retrieval cost decreases as k increases, and the
batched k heuristic can in fact be better than the
corresponding simple k heuristic.


Modified Batched K TR 


The Markov chain describing this batched k 


transposition heuristic is given as follows 
Through the numerical examples presented in Tables 3.9
through 3.12 and the simulation results in Appendix D, it is
indicated that the transposition rule can be more efficient
than the corresponding move to front rule, in terms of both
the expected number of buffer lookups, and the expected
number of auxiliary memory fetches.


In this chapter, we have shown that, when a few extra
bits of storage are utilized to remember a few previous
requests, the asymptotic retrieval cost of the limited-buffer
self-organizing scheme can be further reduced. From the
numerical examples presented, the asymptotic retrieval cost
of both the simple k and batched k heuristics decreases as k
increases.
TABLE 3.7
$p_k(i,j)$ : probability of record i in position j, assuming
retrieval probabilities satisfy Zipf's distribution. Batched
k move to front heuristic is adopted here.

TABLE 3.8
$p_k(i,j)$ : probability of record i in position j, assuming
retrieval probabilities satisfy the Wedge distribution.
TABLE 3.9a : Batched 2 Heuristic (Zipf's Distribution)


The asymptotic retrieval cost for the list of n elements
with m buffers, whose retrieval probabilities satisfy Zipf's
distribution. The cost is in terms of $\mu$, the expected number
of buffer lookups, and $\nu$, the expected number of auxiliary
memory fetches.
TABLE 3.9b : Batched 2 Heuristic (Zipf's Distribution)
(ratio of the retrieval to optimal cost)


The asymptotic retrieval cost for the list of n elements
with m buffers, whose retrieval probabilities satisfy Zipf's
distribution. The cost is in terms of the optimal number of
buffer lookups, and optimal number of auxiliary memory
fetches.
TABLE 3.10a : Batched 3 Heuristic (Zipf's Distribution)


The asymptotic retrieval cost for the list of n elements
with m buffers, whose retrieval probabilities satisfy Zipf's
distribution. The cost is in terms of $\mu$, the expected number
of buffer lookups, and $\nu$, the expected number of auxiliary
memory fetches.
TABLE 3.10b : Batched 3 Heuristic (Zipf's Distribution)
(ratio of the retrieval to optimal cost)


The asymptotic retrieval cost for the list of n elements
with m buffers, whose retrieval probabilities satisfy Zipf's
distribution. The cost is in terms of the optimal number of
buffer lookups, and optimal number of auxiliary memory
fetches.
TABLE 3.11a : Batched 2 Heuristic (Wedge Distribution)


The asymptotic retrieval cost for the list of n elements
with m buffers, whose retrieval probabilities satisfy the
Wedge distribution. The cost is in terms of $\mu$, the expected
number of buffer lookups, and $\nu$, the expected number of
auxiliary memory fetches.
TABLE 3.11b : Batched 2 Heuristic (Wedge Distribution)
(ratio of the retrieval to optimal cost)


The asymptotic retrieval cost for the list of n elements
with m buffers, whose retrieval probabilities satisfy the
Wedge distribution. The cost is in terms of the optimal
number of buffer lookups, and optimal number of auxiliary
memory fetches.
TABLE 3.12a : Batched 3 Heuristic (Wedge Distribution)


The asymptotic retrieval cost for the list of n elements
with m buffers, whose retrieval probabilities satisfy the
Wedge distribution. The cost is in terms of $\mu$, the expected
number of buffer lookups, and $\nu$, the expected number of
auxiliary memory fetches.
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TABLE 3.12b : Batched 3 Heuristic (Wedge Distribution) 
(ratio of the retrieval to optimal cost) 


The asymptotic retrieval cost for the list of n elements 
with m buffers, whose retrieval probabilities satisfy the 
Wedge distribution. The cost is in terms of the optimal 
number of buffer lookups, and optimal number of auxiliary 
memory fetches. 
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IV. Conclusion 


We have shown how self-organizing heuristics can be
adapted to a limited buffer size linear search system to
improve performance. Through numerical examples and
simulation results, we have shown that the transposition
heuristics are generally better than the corresponding move
to front heuristics. Furthermore, the study of "k in a row"
heuristics shows that better asymptotic performance can
be attained at the expense of a few extra bits of storage.


A related application of the limited buffer size 
self-organizing scheme is in the construction of paging 
algorithms for memory management. After every page 
reference, the list of pages currently residing in the main 
memory can be reorganized. In case of a page fault, a page 
will be removed to make room for the new page, according to 
one of the limited-buffer self-organizing heuristics 
discussed so far. The "Least Recently Used" (LRU) paging 
algorithm corresponds to the modified simple move to front 
heuristic. Similarly, the transposition rule or "k in a row" 
heuristics could be incorporated into the paging algorithms 
to improve performance. 
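
The correspondence between LRU paging and the modified simple move to front heuristic can be sketched as follows. This is only a minimal illustration: it assumes the resident pages are kept in a Python list with the front of the list at index 0, and that on a page fault the page in the last position is the one removed. The function name `reference` and its return value are hypothetical and do not come from the thesis.

```python
def reference(buffer, page, m):
    """Apply one page reference under the modified simple move to front rule.

    buffer: list of resident pages, front of the list at index 0 (assumed layout).
    page:   the page being referenced.
    m:      buffer size (number of page frames).
    Returns True on a page fault, False on a hit.
    """
    if page in buffer:
        buffer.remove(page)      # hit: take the page out of its current position
        buffer.insert(0, page)   # and move it to the front
        return False
    if len(buffer) >= m:
        buffer.pop()             # fault with a full buffer: drop the last page,
                                 # which is the least recently used one
    buffer.insert(0, page)       # the fetched page enters at the front
    return True


frames = []
faults = [reference(frames, p, 3) for p in [1, 2, 3, 1, 4, 2]]
```

Because every referenced page moves to the front, the last position always holds the page whose most recent reference is oldest, which is why eviction from the back of the list behaves like LRU.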


So far, our study of the transposition heuristic for 
the limited-buffer linear search system has been limited to 
numerical examples and simulation results, as 
the actual retrieval cost is difficult to derive. It is 
hoped that in the future, the class of transposition 
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heuristics can be treated to more analytical study, so that 


more can be said of the performance in general. 


As with the conventional self-organizing heuristics, the 
convergence rate of the limited-buffer self-organizing 
heuristics cannot be overlooked. We have seen that, in the 
conventional system, although the transposition rule and the 
"k in a row" heuristics are asymptotically more efficient, 
their convergence rates are generally slower than that of the 
simple move to front heuristic. Therefore, the simple move to 
front heuristic may be a more favorable choice when there 
are fewer requests in the linear search system or when the 
access frequency changes from time to time. For this 
limited-buffer scheme, we may expect similarly slower 
convergence to the asymptotic efficiency by the 
asymptotically more efficient transposition and "k in a 
row" heuristics. Here, there are two performance measures of 
interest, namely the expected number of buffer lookups and the 
expected number of auxiliary memory fetches. The study of 
the relative convergence rate of the various heuristics 
remains an open problem. 


One major assumption of our study is the independence 
assumption, i.e., the retrieval probability of each record 
remains unchanged and independent at all times. When there 
is a high correlation between successive requests, the class 
of move to front heuristics may outperform the corresponding 
transposition rule. Lam, Leung, and Siu [11] examine the 
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simple move to front heuristics when each access is assumed 
to be dependent on the access immediately preceding it. 
However, the study of the transposition rule with dependent 
accesses still remains an open problem. 
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Appendix A. Expected Number of Auxiliary Memory Fetches 


The expected number of auxiliary memory fetches can be shown 
to be as follows 
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Given n records and a buffer size of m, the expected number 

of memory fetches is defined to be 
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Before proceeding to simplify the above expression, we shall 


first show that 
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Therefore, the expression (A.2) can be further reduced to 
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Appendix B. On the Markov Chain for the Modified Simple K 


MTF Heuristic 


In Chapter 3, we have shown that the limited buffer 
linear search system can be modelled as a Markov Chain, 
whereby the state is described by each of the possible 
configurations of the buffer. Let $\pi_i$ be one of the $^{n}P_{m}$ 
possible states of the m-size buffer. Let $\pi_i = 
[R_{i_1} R_{i_2} \ldots R_{i_m}]$, and let those records that are not in the 
buffer for this configuration be labelled as $R_{i_{m+1}}$ through 
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where the 2z previous requests are all for record R,. It is 
shown in Chapter 3 that the Markov Chain must satisfy the 


following equations 
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ComuCacan be written as 


The proof will be given later. First of all, we shall 
present the following lemmas. 
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LEMMA B.17 
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Appendix C. On the Markov Chain for the Modified Batched K 


Heuristic 


Let $\pi_i = [R_{i_1} R_{i_2} \ldots R_{i_m}]$ be a particular state of the 
Markov chain, where each state represents one particular 
configuration of the buffer, under the modified batched k 
heuristic. Let the records that do not reside in the buffer for 
this particular state $\pi_i$ be denoted by $R_{i_{m+1}}$ through $R_{i_n}$. 


In Chapter 3, we have shown that the Markov Chain must 
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We shall study the two lemmas below before proceeding to the 
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Appendix D. Simulations 


In Chapters 2 and 3, a good approximation of the 
retrieval cost for any self-organizing heuristic is 
obtained by evaluating the probabilities of each state at 
successive times t, where the state probabilities at time t-1 
are multiplied by the transition matrix. The long term 
probability of each state is approximated by the probability 
at large t. Let $\pi_i = [R_{i_1} R_{i_2} \ldots R_{i_m}]$ and let $\mathrm{Prob}[\pi_i]$ denote the 
long term probability of state $\pi_i$. The retrieval cost is 
then given as follows 


Expected number of buffer lookups, μ(m), 
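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Expected number of auxiliary memory fetches, ν(m), 


$$\nu(m) = \sum_{i=1}^{{}^{n}P_{m}} \Bigl( 1 - \sum_{j=1}^{m} p_{i_j} \Bigr) \mathrm{Prob}[\pi_i]$$

where $p_{i_j}$ is the retrieval probability of record $R_{i_j}$. 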
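
The successive-multiplication procedure described above can be sketched for a small system. The sketch below is an illustration under stated assumptions, not the program used for the thesis: it assumes the modified simple move to front rule, with the last buffer position dropped on a miss, records numbered from 0, and an arbitrary uniform starting distribution. The names `mtf_next_state` and `expected_fetches` are hypothetical. Only the expected number of auxiliary memory fetches is computed, as the probability that the requested record lies outside the buffer, weighted by the state probabilities.

```python
from itertools import permutations


def mtf_next_state(state, r):
    """Buffer configuration after a request for record r (modified move to front).

    Assumption: on a miss the last buffer position is dropped and r enters at the front.
    """
    if r in state:
        rest = tuple(x for x in state if x != r)   # hit: close the gap left by r
    else:
        rest = state[:-1]                          # miss: drop the last element
    return (r,) + rest


def expected_fetches(p, m, steps=2000):
    """Approximate the expected number of auxiliary memory fetches per request."""
    n = len(p)
    states = list(permutations(range(n), m))        # the nPm buffer configurations
    prob = {s: 1.0 / len(states) for s in states}   # arbitrary initial distribution
    for _ in range(steps):
        nxt = dict.fromkeys(states, 0.0)
        for s, ps in prob.items():
            for r in range(n):                      # one step of the Markov chain
                nxt[mtf_next_state(s, r)] += ps * p[r]
        prob = nxt
    # Sum over states of (probability the request is not in the buffer) * Prob[state].
    return sum((1.0 - sum(p[r] for r in s)) * ps for s, ps in prob.items())
```

Even this toy version shows the cost noted in the text: the number of states grows as the number of m-permutations of n records, so both the storage and the work per step become impractical quickly.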


We can see that the above method yields a very close 
approximation to the actual retrieval cost. However, its 
shortcomings include the huge number of computations 
required in successive multiplication of the transition 
matrix, and the enormous amount of storage required to store 
the $^{n}P_{m}$ states and the transition matrix. The other method 
of examining the behavior of various heuristics, though not 
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as accurate, is by simulation. The next two paragraphs 


describe how the simulation is done. 


To study the cost of a self-organizing heuristic on a 
linear search system with n elements and m buffers, the 
initial list is filled with m elements, where the m elements 
selected (between 1 and n) are randomly generated. The 
search requests are also produced by a random number 
generator, distributed according to the retrieval 
probabilities. Every time a search request is generated, the 
number of comparisons required to locate the requested 
element is recorded and the list is reorganized according to 
the required self-organizing rule. The retrieval cost is 
then given by the average number of comparisons required to 
satisfy a search request, and the average number of times a 
search request is not in the buffer. 
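
A minimal sketch of such a simulation is given below, under several assumptions that are not taken from the text: the reorganizing rule is the modified simple move to front heuristic, a request that misses the buffer is charged m comparisons (the whole buffer is scanned), the missing record replaces the last buffer element, records are numbered from 0, and Python's built-in generator stands in for the generator used in the thesis. The function name `simulate` is hypothetical.

```python
import random


def simulate(p, m, searches=100_000, seed=1):
    """Simulate a limited-buffer list under the modified move to front rule.

    Returns (average comparisons per request, fraction of requests not in buffer).
    Assumptions: a miss costs m comparisons, and the missing record replaces
    the last buffer element before being placed at the front.
    """
    rng = random.Random(seed)
    n = len(p)
    records = list(range(n))
    buffer = rng.sample(records, m)              # m distinct random initial elements
    comparisons = 0
    misses = 0
    for _ in range(searches):
        r = rng.choices(records, weights=p)[0]   # request drawn by retrieval probability
        if r in buffer:
            pos = buffer.index(r)
            comparisons += pos + 1               # found at 1-based position pos + 1
            buffer.pop(pos)
        else:
            comparisons += m                     # scanned all m buffer positions
            misses += 1
            buffer.pop()                         # drop the last element
        buffer.insert(0, r)                      # reorganize: requested record to front
    return comparisons / searches, misses / searches
```

Other heuristics can be tried by replacing the reorganization step while keeping the two counters unchanged.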


To generate search requests, we use a multiplicative 
congruential random number generator with period $2^{3?}$ to 
generate random numbers in the range from 0 to $2^{31}-1$. This 
number is then divided by $2^{31}$ to give a real number between 
0 and 1. Suppose each record i has a retrieval probability 
of $p_i$, and $\sum_{i=1}^{n} p_i = 1$. If the random number generated lies 
in the interval $[\sum_{i=1}^{j-1} p_i, \sum_{i=1}^{j} p_i)$, record j will be requested. 
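
The request generation just described can be sketched as follows. The constants are an assumption: the multiplier 16807 with modulus $2^{31}-1$ is the well-known Park-Miller "minimal standard" choice and is used here only for illustration, since the constants of the generator actually used are not stated in this section. The names `lcg` and `pick_record` are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate

MODULUS = 2 ** 31 - 1   # assumed modulus (Park-Miller), not taken from the thesis
MULTIPLIER = 16807      # assumed multiplier (Park-Miller), not taken from the thesis


def lcg(seed):
    """Yield integers from a multiplicative congruential generator."""
    x = seed
    while True:
        x = (MULTIPLIER * x) % MODULUS
        yield x


def pick_record(u, p):
    """Return the 1-based record j whose interval [p_1+..+p_(j-1), p_1+..+p_j) contains u."""
    cumulative = list(accumulate(p))
    j = bisect_right(cumulative, u)     # number of partial sums that are <= u
    return min(j, len(p) - 1) + 1       # clamp guards against rounding in the last sum


gen = lcg(1)
u = next(gen) / 2 ** 31                 # scale to a real number between 0 and 1
record = pick_record(u, [0.5, 0.3, 0.2])
```

Precomputing the partial sums once per distribution, rather than on every call as in this sketch, would be the natural refinement for a long run.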


In our simulations, each run consisted of at most 
1,000,000 searches. After every 5000 searches, the average 
retrieval cost per search request was recorded, and was 
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9000-search mark. If these two costs did not deviate by more 
than 0.00001, the simulation terminated. Otherwise, the 
simulation continued until the 1,000,000 searches were 
completed. The results tabulated in Tables D.1, D.2 and D.3 
represent average retrieval costs over two separate runs, 
using different random sequences to generate the search 
requests. All simulation results obtained deviate from the 
average retrieval cost recorded in the Tables by less than 
one percent. 
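
The stopping rule can be sketched as below. This is a hedged reading of the procedure: it assumes that the running average at each 5000-search checkpoint is compared with the running average at the previous checkpoint, and that a single cost value is tracked. The function name `run_until_stable` and its parameters are hypothetical.

```python
def run_until_stable(costs, checkpoint=5000, tolerance=0.00001, max_searches=1_000_000):
    """Consume per-request costs until the running average settles.

    costs: iterable yielding the cost of each successive search request.
    Returns (average cost, number of searches performed).
    Assumption: each checkpoint average is compared with the previous one.
    """
    total = 0.0
    previous = None
    count = 0
    for count, cost in enumerate(costs, start=1):
        total += cost
        if count % checkpoint == 0:
            average = total / count
            if previous is not None and abs(average - previous) <= tolerance:
                return average, count            # two successive checkpoints agree
            previous = average
        if count >= max_searches:
            break                                # give up at the search limit
    return (total / count if count else 0.0), count
```

With two performance measures, the same check would presumably be applied to each; the sketch leaves that choice to the caller.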


The simulation results obtained will not be fully 
representative of the actual search cost, as the results 
greatly depend on the random numbers generated. One would 
normally expect some discrepancy between the simulation 
results and the actual cost. The simulation results 
generally demonstrate the superiority of the transposition 
heuristic over the corresponding move to front heuristic. 
However, the retrieval costs for the simple k heuristic and 
batched k heuristic are not very distinguishable. Referring 
back to the costs tabulated in Chapter 3, we find that when 
the number of elements gets larger, the retrieval costs for 
both the simple k and the corresponding batched k heuristics 
are quite similar, especially for the expected number of 
memory fetches. Therefore, the simulation results do not 
demonstrate any clear-cut superiority between the simple k 
and the corresponding batched k heuristic. However, the 
results do demonstrate the superiority of the k in a row 
heuristics over the corresponding simple heuristics. 




Although this simulation study is not fully representative of 
the actual retrieval cost, it nevertheless serves as an 
alternative way to study the behavior of the self-organizing 
heuristics when it is infeasible to use the enormous amount 
of computation and storage required to obtain a good 
approximation of the actual cost. 
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TABLE D.1 : Simple Heuristics 


The simulated retrieval cost, for a list of n elements and m 
buffers, assuming the retrieval probabilities satisfy Zipf's 
distribution. The cost is in terms of μ, the expected 
number of buffer lookups, and ν, the expected number of 
memory fetches. 
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TABLE D.2 : Simple 2 Heuristics 


The simulated retrieval cost, for a list of n elements and m 
buffers, assuming the retrieval probabilities satisfy Zipf's 
distribution. The cost is in terms of μ, the expected 
number of buffer lookups, and ν, the expected number of 
memory fetches. 
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TABLE D.3 : Batched 2 Heuristics 


The simulated retrieval cost, for a list of n elements and m 
buffers, assuming the retrieval probabilities satisfy Zipf's 
distribution. The cost is in terms of μ, the expected 
number of buffer lookups, and ν, the expected number of 
memory fetches. 
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